- Newsgroups: comp.sys.amiga.programmer
- Path: news.shlink.de!wiloyee!chaos
- From: chaos@wiloyee.shnet.org
- Subject: Re: disable datacache to gain speed , was Demo/game to OS friendly part II
- X-Newsreader: TIN [version 1.2 PL2]
- Organization: Studentenhochhaus Wedel Deutschland
- Message-ID: <DLxyF9.oDD@wiloyee.shnet.org>
- References: <john.hendrikx.48xr@grafix.xs4all.nl>
- Date: Mon, 29 Jan 1996 11:54:45 GMT
-
- John Hendrikx (john.hendrikx@grafix.xs4all.nl) wrote:
-
- : That's interesting, on my setup however the Movem loop is not the fastest way
- : to copy (but that's not the issue). The most interesting result from my test
- : however was that turning off DataCache on my 68030/22 had a dramatic increase
- : in copying performance:
-
- : Testresults - Mar 21 1995
-
- : Using a Movem-loop 200990 bytes/frame (PAL machine, 12 registers used,
- : unrolled 3 times)
- : (9.58 MB/sec, AIBB tells me 9.5-9.8 MB/sec)
-
- : Using a 48 times unrolled Move.l-loop 212353 bytes/frame
- : (10.1 MB/sec)
-
- : Now with datacache off (!):
-
- : Using a 48 times unrolled Move.l-loop 242323 bytes/frame
- : (11.6 MB/sec)
-
- : The Movem loop wasn't the most optimized version possible, but I doubt it would
- : have made much difference.
-
- : (All tests done on a 68030/22 MHz, 60ns 32-bit FastRAM. Source and destination
- : were not overlapping. During tests interrupts and all DMA was disabled)
-
- : Can anybody explain what is happening with the DataCaching? According to my
- : tests ChipRAM access also becomes faster with DataCache off.
-
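- (for reference, a small sketch in c of how the bytes/frame figures above
- turn into MB/sec. it assumes 50 frames per second on a PAL machine and
- 1 MB = 1024*1024 bytes; the helper names are made up for illustration.)
-

```c
#include <stdint.h>

/* Assumption: the quoted "bytes/frame" figures are per PAL frame,
   and a PAL display runs at 50 frames per second. */
#define PAL_FRAMES_PER_SEC 50u

/* Hypothetical helper: bytes copied per frame -> bytes per second. */
uint32_t bytes_per_second(uint32_t bytes_per_frame)
{
    return bytes_per_frame * PAL_FRAMES_PER_SEC;
}

/* Hypothetical helper: MB/sec, taking 1 MB as 1024 * 1024 bytes. */
double mb_per_second(uint32_t bytes_per_frame)
{
    return (double)bytes_per_second(bytes_per_frame) / (1024.0 * 1024.0);
}
```

- with these assumptions 200990 bytes/frame comes out at about 9.58 MB/sec
- and 242323 bytes/frame at about 11.6 MB/sec, which matches the quoted
- numbers.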
-
- i have a similar problem. on my a4000/40 i can speed up all routines with
- random memory access by disabling the datacache. by random memory access
- i mean things like texture mapping, rotating zoomers, etc. that work on
- big textures, on a big memory region, but in a way that makes caching
- almost impossible.
-
- here the explanation is simple. the 68040 will ALWAYS try a burst access,
- meaning it reads 4 longwords. but in the situation of a texturemapper with
- a large texture, 3 of them get thrown away. the 68040 does not allow you to
- switch off the burst mode, you can only switch off the entire cache. if
- no burst mode is possible (as in the 4000/40), the cpu will read all 4
- longwords without burst. the result is that it has to do 4 longword
- accesses every time i want to read a single byte.
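-
- to put rough numbers on this, here is a minimal sketch in c. it is only a
- counting model under my assumptions: every cache miss costs 4 longword bus
- accesses (one line fill without burst), every hit costs none, and with the
- cache off every read costs exactly one access. the function names are
- made up, and real timings depend on the memory system.
-

```c
#include <stdint.h>

#define LONGWORDS_PER_LINE 4u  /* one line fill reads 4 longwords */

/* Bus accesses with the data cache on: each miss fills a whole line.
   Precondition: hits <= reads. */
uint32_t bus_accesses_cached(uint32_t reads, uint32_t hits)
{
    uint32_t misses = reads - hits;
    return misses * LONGWORDS_PER_LINE;
}

/* Bus accesses with the data cache off: one access per read. */
uint32_t bus_accesses_uncached(uint32_t reads)
{
    return reads;
}

/* Returns 1 if, in this model, the cache causes fewer bus accesses
   than running with the cache off. */
int cache_pays_off(uint32_t reads, uint32_t hits)
{
    return bus_accesses_cached(reads, hits) < bus_accesses_uncached(reads);
}
```

- in this model a texturemapper that misses on every read does 4 times the
- bus accesses with the cache on, and the cache only starts to pay off when
- more than 3 out of 4 reads are hits.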
-
- all my demos (yes, i am a kewl c0d3r) contain a routine that is called for
- these effects. it checks the cpu and sets the 68030 to burst off and the
- 68040 to cache off (this is an advantage even on real burst systems). well,
- on the 68060 my demos crash...
-
- in your situation it might be something different, but you may try to switch
- off the burst mode and measure again.
-
- another problem might be that the read and write regions overlap in the
- cache. as far as i know, the 256 byte cache has four pages, leaving 64
- bytes each. if source and destination are a multiple of 64 bytes apart,
- writing may throw out what was just read with a 1 in 4 chance (the 68040
- uses a random generator to select the page). if you wrote something like
- this
-
- source  ds.b    $10000
- dest    ds.b    $10000
-
- change it to
-
- source  ds.b    $10000
-         ds.b    $20         ; padding to shift dest relative to source
- dest    ds.b    $10000
-
- if this solves your problem, then you know what it was.
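-
- why the $20 bytes of padding can help is easier to see with a small model.
- this is a sketch in c, assuming 16 byte cache lines (4 longwords) and that
- the line is picked by the address bits just above the line offset. since
- i am not sure about the exact organisation, the number of sets is a
- parameter: 4 if the cache really is four pages of 64 bytes, 16 if it is
- 256 bytes direct mapped. the names and example addresses are made up.
-

```c
#include <stdint.h>

#define LINE_BYTES 16u  /* assumed line size: 4 longwords */

/* Hypothetical helper: which cache set an address falls into.
   num_sets must be a power of two. */
uint32_t cache_set(uint32_t addr, uint32_t num_sets)
{
    return (addr / LINE_BYTES) & (num_sets - 1u);
}

/* Returns 1 if the two addresses map to the same set, so that an access
   to one can evict the line holding the other. */
int same_set(uint32_t a, uint32_t b, uint32_t num_sets)
{
    return cache_set(a, num_sets) == cache_set(b, num_sets);
}
```

- with source at (say) $20000 and dest directly behind it at $30000, both
- map to the same set in either layout. with the extra $20 bytes dest moves
- to $30020 and lands two sets further on, so the copy loop no longer reads
- and writes through the same set.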
-
- on the other hand, using 14 registers @ 4 bytes makes 56 bytes. this is
- almost an entire cache page. to check this out properly, try exactly 32
- bytes on a cache page boundary.
-
- all this does not lead to a better routine, but to more understanding where
- the problem comes from.
-
- please mail me your results.
-
-
-